program: exit(0) directly in terminateWithSignal to avoid runtime exit(2)#337
Closed
aron-muon wants to merge 1 commit into
Closed
program: exit(0) directly in terminateWithSignal to avoid runtime exit(2)#337aron-muon wants to merge 1 commit into
aron-muon wants to merge 1 commit into
Conversation
212e498 to
ec57cf3
Compare
Member
|
Even though returning 'exit 0' may be acceptable for daemons, |
Per buildbarn#337 review: returning exit 0 on graceful signal-triggered shutdown is correct for daemons (k8s pod phase Succeeded, systemd inactive (dead) without failure), but wrong for one-shot CLI tools (bb_copy, sync_jwks_to_configmap, etc.) where Ctrl-C should leave a non-zero status so wrapper scripts know the work was interrupted mid-run. Add a variadic Option to RunMain. The default behaviour is the one-shot semantic (exit 128+signal, the POSIX convention that bash, zsh, systemd, and kubelet all derive from WIFSIGNALED). Daemons opt in via program.WithDaemonExit() for the exit-0 behaviour. terminateWithSignal still skips the signal.Reset() + process.Signal() self-raise dance — that path is what triggered the runtime dieFromSignal exit(2) trap on multi-goroutine programs running as PID 1 in a PID namespace, and it adds nothing now that we exit with the conventional code directly. The PID-1 reaper in relaunchIfPID1 takes the same daemon flag so its exit-code policy matches the original launch. Migrate the daemons in this repo: bb_storage and bb_replicator. Leave bb_copy and sync_jwks_to_configmap unchanged so they exit 128+signal on interruption. bb-remote-execution daemons (bb_worker, bb_scheduler, bb_runner, bb_noop_worker, bb_virtual_tmp) and external one-shots like bb_autoscaler will pick the variant matching their lifecycle when they bump the dependency. Refs: - golang/go#19326 - golang/go#46321
ec57cf3 to
dda1059
Compare
Contributor
Author
Yeah, makes sense - new approach is in, thoughts on self selecting whether running as |
aron-muon
added a commit
to aron-muon/bb-storage
that referenced
this pull request
May 15, 2026
Per buildbarn#337 review: returning exit 0 on graceful signal-triggered shutdown is correct for daemons (k8s pod phase Succeeded, systemd inactive (dead) without failure), but wrong for one-shot CLI tools (bb_copy, sync_jwks_to_configmap, etc.) where Ctrl-C should leave a non-zero status so wrapper scripts know the work was interrupted mid-run. Add a variadic Option to RunMain. The default behaviour is the one-shot semantic (exit 128+signal, the POSIX convention that bash, zsh, systemd, and kubelet all derive from WIFSIGNALED). Daemons opt in via program.WithDaemonExit() for the exit-0 behaviour. terminateWithSignal still skips the signal.Reset() + process.Signal() self-raise dance — that path is what triggered the runtime dieFromSignal exit(2) trap on multi-goroutine programs running as PID 1 in a PID namespace, and it adds nothing now that we exit with the conventional code directly. The PID-1 reaper in relaunchIfPID1 takes the same daemon flag so its exit-code policy matches the original launch. Migrate the daemons in this repo: bb_storage and bb_replicator. Leave bb_copy and sync_jwks_to_configmap unchanged so they exit 128+signal on interruption. bb-remote-execution daemons (bb_worker, bb_scheduler, bb_runner, bb_noop_worker, bb_virtual_tmp) and external one-shots like bb_autoscaler will pick the variant matching their lifecycle when they bump the dependency. Refs: - golang/go#19326 - golang/go#46321
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
terminateWithSignal()re-raises the original signal back to the process viasignal.Reset()+process.Signal()to mimic a signal-style exit (e.g.128+SIGTERM=143). Butsignal.Reset()does not installSIG_DFL, it only disables Go's user-channel routing. The runtime's signal trampoline is still installed and intercepts the raised signal, dispatching it toruntime.dieFromSignal():Multi-goroutine programs running as PID 1 in a PID namespace reliably hit the
exit(2)fall-through.bb_workerand other daemons exit code 2 on graceful SIGTERM despite a clean shutdown, pods end Failed/Error in k8s. The existingtime.Sleep + os.Exitfallback was unreachable code (andtime.Sleep(5)is 5 nanoseconds anyway).Approach
Skip the signal-raise dance and exit with the right code directly.
128+signalis what POSIX shells and init systems derive fromWIFSIGNALED, so it's user-visibly equivalent without going near Go's signal machinery.Per review feedback: exit
128+signalis correct for one-shot tools (bb_copy,sync_jwks_to_configmap) where signal interruption should leave a non-zero status, but wrong for daemons where graceful shutdown should look successful to the supervisor (k8s pod phaseSucceeded, systemd inactive (dead) without failure).Add a variadic
OptiontoRunMain:128+signal(correct for one-shot CLI tools)program.WithDaemonExit(), exit0(correct for long-running daemons)Migrate this repo's daemons (
bb_storage,bb_replicator) in the same PR; leavebb_copyandsync_jwks_to_configmapunchanged. Cross-repo daemons (bb_worker,bb_scheduler,bb_runnerinbb-remote-execution;bb_autoscalerelsewhere) pick the daemon variant matching their lifecycle when they bump the dependency.Refs